The following details differential gene expression and network analysis of: Enhanced cortical neural stem cell identity through short SMAD/WNT inhibition in human cerebral organoids facilitates emergence of outer radial glial cells.
The authors of this paper investigate a novel culture protocol for 3D Brain Organoids which is thought to enhance the emergence of outer radial glia (oRGs). These cells are a vital component involved in cortical expansions and have been implicated in a variety of neurdevelopmental disorders (NDDs). Feel free to read more here: Molecular identity of human outer radial glia during cortical development..
The data I used ended up being has 33 samples in total which vary by age and culture protocol. The ages are Day 0 (control, n=3), Day 30 (n = 30). There are 5 protocols being compared in this study hESC (controls), Inhibitor Free (IF), WNT inhibition (WNT), Dual SMAD inhibition (Dual), and Triple Inhibition (Triple).
Here is a schematic of the data:
In assignment 1, I worked to clean, organize and normalize my data. The data was in an excel file as RPKM and was downloaded from GEO, accesion number: GSE189981. the inital bulk sequencing table had multiple column headings denoting the various conditions and cell lines. I cleaned up the headers in that matrix and aligned them with the headers in the sample information table. The genes were already denoted as HGNC, however a few gene symbols were missing. I was able to map 14 additional genes and eliminated 9 unmappable genes. From there, I normalized my data, and verified my normalization with density plots and boxplots. This final, normalized, clean dataframe is my input for this Assignment, alongside a sample information dataframe.
I used the final cleaned dataframe as well as the sample info table from assignment 1 to start assignment 2. Using this data, I first did variance analysis which indicated that I should primarly look at differentiating the organoids by protocol as opposed to age. Based on this, I created a design matrix for my 4 protocols with the iPSC stage being the reference. I then performed differential gene expression analysis with edgeR and fit the results to a general linear model (GLM). I thresholded these results based on FDR < 0.05 and LFC >1 and then proceeded to perform a thresholded over-representation analysis using g:Profiler. I completed 2 analysis workflows. The first is for a total set of genes for all the protocols in comparison to the iPSC stage. This is an organoid level analysis. The second is a unique protocol analysis which allowed me to compare protocol-specific differential expression. This resulted in an output for top overall pathways and top protocol specific pathway. I continue this paradigm here.
The WNT and IF protocols did not return a significant number of unique pathways (5 positive, 1 negative), so it was excluded from the analysis.
knitr::opts_chunk$set(echo = TRUE)
# Install Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE)){
install.packages("BiocManager")}
# Install fgsea
if(! requireNamespace("fgsea")){
install.packages("fgsea")}
library(fgsea)
# Loading the thresholded analysis files
GO_upregulated <- readRDS("./gsea_inputs/GO_Upregulated.RDS")
GO_downregulated <- readRDS("./gsea_inputs/GO_downregulated.RDS")
# Loading DGE files
dual <- readRDS("./gsea_inputs/Dual.RDS")
IF <- readRDS("./gsea_inputs/IF.rds")
triple <-readRDS("./gsea_inputs/Triple.rds")
# These are the results for all protocols vs iPSCs
all <-readRDS("./gsea_inputs/all.rds")
# Making a protocol list
protocols = list(dual = dual, IF = IF, triple = triple, all = all)
for (i in names(protocols)) {
# Extracting the current df
working_df <- protocols[[i]]
# Calculating ranks based on logFC and P-value
rank_df <- data.frame(Name = rownames(working_df),
Rank = sign(working_df$logFC) * -log10(working_df$PValue))
# Ordering the calculated ranks
rank_df <- rank_df[order(rank_df$Rank),]
## Making my ranks into a named vector so I can use it with fgsea
rank_stats <- rank_df$Rank
names(rank_stats) <- rank_df$Name
# Outputting individual rank and stats object for every protocol
assign(paste0(i, "_stats"), rank_stats)
assign(paste0(i, "_rank"), rank_df)
# Removing extras
rm(working_df, rank_df, rank_stats, i)
}
# Saving my gene set location
gene_set_file <- "Human_GOBP_AllPathways_noPFOCR_no_GO_iea_March_01_2024_symbol.gmt"
# Loading my gene set
gene_set <- gmtPathways(gene_set_file)
# Running fgsea for 3 protocols and all organoids
all_gsea <-fgsea(pathways = gene_set,
stats = all_stats,
minSize = 15,
maxSize = 700,
nperm = 2000)
dual_gsea <-fgsea(pathways = gene_set,
stats = dual_stats,
minSize = 15,
maxSize = 700,
nperm = 2000)
IF_gsea <- fgsea(pathways = gene_set,
stats = IF_stats,
minSize = 15,
maxSize = 700,
nperm = 2000)
triple_gsea <- fgsea(pathways = gene_set,
stats = triple_stats,
minSize = 15,
maxSize = 700,
nperm = 2000)
I selected a range of 15 - 700 terms given that important neural processes involve many genes and may be missed in an narrower annotation. Additionally, my later overlapping gene exclusion will ensure any overly large terms will be excluded regardless.
# Rerformatting code courtesy of Dr Isserlin, to change fgsea output to GSEA output
format_fgsea_results<- function(current_fgsea_results, current_ranks ){
#calculating the rank at max
#fgsea returns the leading edge. Just need to extract the highest rank from
## set to get the rank at max
calculated_rank_at_max <- apply(current_fgsea_results,1,FUN=function(x){
max(which(names(current_ranks) %in% unlist(x[8])))})
#The last column is a comma separated list of genes that are found in the leading edge
gsea_results <- cbind(current_fgsea_results$pathway,
current_fgsea_results$pathway,
"Details",
current_fgsea_results$size,
current_fgsea_results$ES,
current_fgsea_results$NES,
current_fgsea_results$pval,
current_fgsea_results$padj,
0,
calculated_rank_at_max,
apply(current_fgsea_results,1,
FUN=function(x){paste(unlist(x[8]),collapse=",")}))
colnames(gsea_results) <-
c("NAME","description","GSdetails","SIZE","ES","NES","pval","padj","FWER","Rank at Max","leading edge genes")
return(gsea_results)
}
# Reformatting all my GSEA results
dual_gsea = as.data.frame(format_fgsea_results(dual_gsea,dual_stats))
all_gsea = as.data.frame(format_fgsea_results(all_gsea,all_stats))
IF_gsea = as.data.frame(format_fgsea_results(IF_gsea,IF_stats))
triple_gsea = as.data.frame(format_fgsea_results(triple_gsea,triple_stats))
# Storing my total number of pathways between 15 and 700
total_pathways <- length(gene_set[which(sapply(gene_set, length) > 14 & sapply(gene_set, length) < 701)])
# Parsing out my positive and negative genes based on normalized enrichment score (NES)
# Also filtering for significance
# for dual
dual_gsea_pos <- dual_gsea[which(dual_gsea$NES >0 & dual_gsea$padj < 0.05),]
dual_gsea_neg <- dual_gsea[which(dual_gsea$NES <0 & dual_gsea$padj < 0.05),]
# for triple
triple_gsea_pos <- triple_gsea[which(triple_gsea$NES >0 & triple_gsea$padj < 0.05),]
triple_gsea_neg <- triple_gsea[which(triple_gsea$NES <0 & triple_gsea$padj < 0.05),]
# for IF
IF_gsea_pos <- IF_gsea[which(IF_gsea$NES >0 & IF_gsea$padj < 0.05),]
IF_gsea_neg <- IF_gsea[which(IF_gsea$NES <0 & IF_gsea$padj < 0.05),]
# for all
all_gsea_pos <- all_gsea[which(all_gsea$NES >0 & all_gsea$padj < 0.05),]
all_gsea_neg <- all_gsea[which(all_gsea$NES <0 & all_gsea$padj < 0.05),]
There are 7044 total gene sets with 15
to 700 terms
# Summary stat table which gives me the number of genes pre and post filtration for every protocol
summary_stats <- data.frame(Dual = c(nrow(dual_gsea[which(dual_gsea$NES >0),]),
nrow(dual_gsea[which(dual_gsea$NES <0),]),
nrow(dual_gsea_pos),
nrow(dual_gsea_neg)),
Triple = c(nrow(triple_gsea[which(triple_gsea$NES >0),]),
nrow(triple_gsea[which(triple_gsea$NES <0),]),
nrow(triple_gsea_pos),
nrow(triple_gsea_neg)),
IF = c(nrow(IF_gsea[which(IF_gsea$NES >0),]),
nrow(IF_gsea[which(IF_gsea$NES <0),]),
nrow(IF_gsea_pos),
nrow(IF_gsea_neg)),
all = c(nrow(all_gsea[which(all_gsea$NES >0),]),
nrow(all_gsea[which(all_gsea$NES <0),]),
nrow(all_gsea_pos),
nrow(all_gsea_neg)),
row.names = c("Positive NES","Negative NES","Positive NES, FDR < 0.05","Negative NES, FDR < 0.05"))
kableExtra::kable_styling(knitr::kable(summary_stats,
caption = paste("Summary: There are",total_pathways,"total gene sets
between 15 and 700 terms, subsets for each
protocl are shown below",sep =" "),
font_size = 10))
| Dual | Triple | IF | all | |
|---|---|---|---|---|
| Positive NES | 3040 | 2865 | 3151 | 3705 |
| Negative NES | 3182 | 3357 | 3071 | 2517 |
| Positive NES, FDR < 0.05 | 196 | 339 | 182 | 259 |
| Negative NES, FDR < 0.05 | 571 | 935 | 759 | 551 |
# Positive summary table which gives me the number of genes pre and post filtration for every protocol
positive_summary_table <- data.frame(dual = dual_gsea_pos$NAME[1:10],
Triple = triple_gsea_pos$NAME[1:10],
IF = IF_gsea_pos$NAME[1:10],
all = all_gsea_pos$NAME[1:10])
negative_summary_table <- data.frame(dual = dual_gsea_neg$NAME[1:10],
Triple = triple_gsea_neg$NAME[1:10],
IF = IF_gsea_neg$NAME[1:10],
all = all_gsea_neg$NAME[1:10])
kableExtra::kable_styling(knitr::kable(positive_summary_table,
caption = "Summary: Top positive pathways by protocol"),
font_size = 10)
| dual | Triple | IF | all |
|---|---|---|---|
| PID_RXR_VDR_PATHWAY%MSIGDB_C2%PID_RXR_VDR_PATHWAY | ATP BIOSYNTHESIS%BIOCYC%PWY-7980 | HALLMARK_MYOGENESIS%MSIGDBHALLMARK%HALLMARK_MYOGENESIS | HALLMARK_MYOGENESIS%MSIGDBHALLMARK%HALLMARK_MYOGENESIS |
| CADHERIN SIGNALING PATHWAY%PANTHER PATHWAY%P00012 | REELIN SIGNALING PATHWAY%PATHWAY INTERACTION DATABASE NCI-NATURE CURATED DATA%REELIN SIGNALING PATHWAY | CADHERIN SIGNALING PATHWAY%PANTHER PATHWAY%P00012 | ATP BIOSYNTHESIS%BIOCYC%PWY-7980 |
| WNT SIGNALING PATHWAY%PANTHER PATHWAY%P00057 | ALZHEIMER DISEASE-AMYLOID SECRETASE PATHWAY%PANTHER PATHWAY%P00003 | WNT SIGNALING PATHWAY%PANTHER PATHWAY%P00057 | ALZHEIMER DISEASE-AMYLOID SECRETASE PATHWAY%PANTHER PATHWAY%P00003 |
| MEMBRANE TRAFFICKING%REACTOME DATABASE ID RELEASE 65%199991 | BETA1 ADRENERGIC RECEPTOR SIGNALING PATHWAY%PANTHER PATHWAY%P04377 | MEMBRANE TRAFFICKING%REACTOME DATABASE ID RELEASE 65%199991 | CADHERIN SIGNALING PATHWAY%PANTHER PATHWAY%P00012 |
| ACTIVATION OF NMDA RECEPTORS AND POSTSYNAPTIC EVENTS%REACTOME%R-HSA-442755.9 | BETA2 ADRENERGIC RECEPTOR SIGNALING PATHWAY%PANTHER PATHWAY%P04378 | ACTIVATION OF NMDA RECEPTORS AND POSTSYNAPTIC EVENTS%REACTOME%R-HSA-442755.9 | GABA-B_RECEPTOR_II_SIGNALING%PANTHER PATHWAY%P05731 |
| GABA SYNTHESIS, RELEASE, REUPTAKE AND DEGRADATION%REACTOME%R-HSA-888590.3 | CADHERIN SIGNALING PATHWAY%PANTHER PATHWAY%P00012 | SIGNALING BY NTRK2 (TRKB)%REACTOME DATABASE ID RELEASE 65%9006115 | METABOTROPIC GLUTAMATE RECEPTOR GROUP III PATHWAY%PANTHER PATHWAY%P00039 |
| LOSS OF PROTEINS REQUIRED FOR INTERPHASE MICROTUBULE ORGANIZATION FROM THE CENTROSOME%REACTOME%R-HSA-380284.3 | ENKEPHALIN RELEASE%PANTHER PATHWAY%P05913 | COPI-MEDIATED ANTEROGRADE TRANSPORT%REACTOME DATABASE ID RELEASE 65%6807878 | WNT SIGNALING PATHWAY%PANTHER PATHWAY%P00057 |
| INTRA-GOLGI AND RETROGRADE GOLGI-TO-ER TRAFFIC%REACTOME DATABASE ID RELEASE 65%6811442 | GABA-B_RECEPTOR_II_SIGNALING%PANTHER PATHWAY%P05731 | INTRA-GOLGI AND RETROGRADE GOLGI-TO-ER TRAFFIC%REACTOME DATABASE ID RELEASE 65%6811442 | MEMBRANE TRAFFICKING%REACTOME DATABASE ID RELEASE 65%199991 |
| THROMBOXANE SIGNALLING THROUGH TP RECEPTOR%REACTOME%R-HSA-428930.4 | HETEROTRIMERIC G-PROTEIN SIGNALING PATHWAY-GI ALPHA AND GS ALPHA MEDIATED PATHWAY%PANTHER PATHWAY%P00026 | ABC TRANSPORTERS IN LIPID HOMEOSTASIS%REACTOME DATABASE ID RELEASE 65%1369062 | ACTIVATION OF NMDA RECEPTORS AND POSTSYNAPTIC EVENTS%REACTOME%R-HSA-442755.9 |
| ADORA2B MEDIATED ANTI-INFLAMMATORY CYTOKINES PRODUCTION%REACTOME DATABASE ID RELEASE 65%9660821 | HISTAMINE H2 RECEPTOR MEDIATED SIGNALING PATHWAY%PANTHER PATHWAY%P04386 | TRANSMISSION ACROSS CHEMICAL SYNAPSES%REACTOME DATABASE ID RELEASE 65%112315 | COPI-MEDIATED ANTEROGRADE TRANSPORT%REACTOME DATABASE ID RELEASE 65%6807878 |
kableExtra::kable_styling(knitr::kable(negative_summary_table,
caption = "Summary: Top negative pathways by protocol"),
font_size = 10)
| dual | Triple | IF | all |
|---|---|---|---|
| HALLMARK_G2M_CHECKPOINT%MSIGDBHALLMARK%HALLMARK_G2M_CHECKPOINT | HALLMARK_G2M_CHECKPOINT%MSIGDBHALLMARK%HALLMARK_G2M_CHECKPOINT | HALLMARK_G2M_CHECKPOINT%MSIGDBHALLMARK%HALLMARK_G2M_CHECKPOINT | HALLMARK_INTERFERON_GAMMA_RESPONSE%MSIGDBHALLMARK%HALLMARK_INTERFERON_GAMMA_RESPONSE |
| HALLMARK_COMPLEMENT%MSIGDBHALLMARK%HALLMARK_COMPLEMENT | HALLMARK_COMPLEMENT%MSIGDBHALLMARK%HALLMARK_COMPLEMENT | HALLMARK_E2F_TARGETS%MSIGDBHALLMARK%HALLMARK_E2F_TARGETS | HALLMARK_INTERFERON_ALPHA_RESPONSE%MSIGDBHALLMARK%HALLMARK_INTERFERON_ALPHA_RESPONSE |
| HALLMARK_E2F_TARGETS%MSIGDBHALLMARK%HALLMARK_E2F_TARGETS | HALLMARK_E2F_TARGETS%MSIGDBHALLMARK%HALLMARK_E2F_TARGETS | HALLMARK_ALLOGRAFT_REJECTION%MSIGDBHALLMARK%HALLMARK_ALLOGRAFT_REJECTION | HALLMARK_ALLOGRAFT_REJECTION%MSIGDBHALLMARK%HALLMARK_ALLOGRAFT_REJECTION |
| HALLMARK_ALLOGRAFT_REJECTION%MSIGDBHALLMARK%HALLMARK_ALLOGRAFT_REJECTION | HALLMARK_INTERFERON_GAMMA_RESPONSE%MSIGDBHALLMARK%HALLMARK_INTERFERON_GAMMA_RESPONSE | HALLMARK_UNFOLDED_PROTEIN_RESPONSE%MSIGDBHALLMARK%HALLMARK_UNFOLDED_PROTEIN_RESPONSE | HALLMARK_UNFOLDED_PROTEIN_RESPONSE%MSIGDBHALLMARK%HALLMARK_UNFOLDED_PROTEIN_RESPONSE |
| HALLMARK_UNFOLDED_PROTEIN_RESPONSE%MSIGDBHALLMARK%HALLMARK_UNFOLDED_PROTEIN_RESPONSE | HALLMARK_INTERFERON_ALPHA_RESPONSE%MSIGDBHALLMARK%HALLMARK_INTERFERON_ALPHA_RESPONSE | HALLMARK_MTORC1_SIGNALING%MSIGDBHALLMARK%HALLMARK_MTORC1_SIGNALING | HALLMARK_MYC_TARGETS_V2%MSIGDBHALLMARK%HALLMARK_MYC_TARGETS_V2 |
| HALLMARK_MTORC1_SIGNALING%MSIGDBHALLMARK%HALLMARK_MTORC1_SIGNALING | HALLMARK_ALLOGRAFT_REJECTION%MSIGDBHALLMARK%HALLMARK_ALLOGRAFT_REJECTION | HALLMARK_ESTROGEN_RESPONSE_LATE%MSIGDBHALLMARK%HALLMARK_ESTROGEN_RESPONSE_LATE | AEROBIC RESPIRATION I (CYTOCHROME C)%BIOCYC%PWY-3781 |
| HALLMARK_OXIDATIVE_PHOSPHORYLATION%MSIGDBHALLMARK%HALLMARK_OXIDATIVE_PHOSPHORYLATION | HALLMARK_UNFOLDED_PROTEIN_RESPONSE%MSIGDBHALLMARK%HALLMARK_UNFOLDED_PROTEIN_RESPONSE | HALLMARK_MYC_TARGETS_V1%MSIGDBHALLMARK%HALLMARK_MYC_TARGETS_V1 | PURINE NUCLEOTIDES <I>DE NOVO< I> BIOSYNTHESIS%BIOCYC%PWY-841 |
| HALLMARK_MYC_TARGETS_V1%MSIGDBHALLMARK%HALLMARK_MYC_TARGETS_V1 | HALLMARK_MTORC1_SIGNALING%MSIGDBHALLMARK%HALLMARK_MTORC1_SIGNALING | HALLMARK_MYC_TARGETS_V2%MSIGDBHALLMARK%HALLMARK_MYC_TARGETS_V2 | PID_FANCONI_PATHWAY%MSIGDB_C2%PID_FANCONI_PATHWAY |
| HALLMARK_MYC_TARGETS_V2%MSIGDBHALLMARK%HALLMARK_MYC_TARGETS_V2 | HALLMARK_ESTROGEN_RESPONSE_LATE%MSIGDBHALLMARK%HALLMARK_ESTROGEN_RESPONSE_LATE | PID_FANCONI_PATHWAY%MSIGDB_C2%PID_FANCONI_PATHWAY | PID_AURORA_B_PATHWAY%MSIGDB_C2%PID_AURORA_B_PATHWAY |
| AEROBIC RESPIRATION I (CYTOCHROME C)%BIOCYC%PWY-3781 | HALLMARK_OXIDATIVE_PHOSPHORYLATION%MSIGDBHALLMARK%HALLMARK_OXIDATIVE_PHOSPHORYLATION | PID_AURORA_B_PATHWAY%MSIGDB_C2%PID_AURORA_B_PATHWAY | PID_FOXM1_PATHWAY%MSIGDB_C2%PID_FOXM1_PATHWAY |
# using set operations to get unique positive genes for each protocol
# non for all because it's a combined analysis
# for dual
unique_dual_names_pos = setdiff(dual_gsea_pos$NAME,
union(triple_gsea_pos$NAME,
union(IF_gsea_pos$NAME,
all_gsea_pos$NAME)))
unique_dual_pos <- dual_gsea_pos[dual_gsea_pos$NAME %in% unique_dual_names_pos, ]
# for triple
unique_triple_names_pos = setdiff(triple_gsea_pos$NAME,
union(dual_gsea_pos$NAME,
union(IF_gsea_pos$NAME,
all_gsea_pos$NAME)))
unique_triple_pos <- triple_gsea_pos[triple_gsea_pos$NAME %in% unique_triple_names_pos, ]
# for IF
unique_IF_names_pos = setdiff(IF_gsea_pos$NAME,
union(dual_gsea_pos$NAME,
union(triple_gsea_pos$NAME,
all_gsea_pos$NAME)))
unique_IF_pos <- IF_gsea_pos[IF_gsea_pos$NAME %in% unique_IF_names_pos, ]
# using set operations to get unique negative genes for each protocol
# non for all because it's a combined analysis
# for dual
unique_dual_names_neg = setdiff(dual_gsea_neg$NAME,
union(triple_gsea_neg$NAME,
union(IF_gsea_neg$NAME,
all_gsea_neg$NAME)))
unique_dual_neg <- dual_gsea_neg[dual_gsea_neg$NAME %in% unique_dual_names_neg, ]
# for triple
unique_triple_names_neg = setdiff(triple_gsea_neg$NAME,
union(dual_gsea_neg$NAME,
union(IF_gsea_neg$NAME,
all_gsea_neg$NAME)))
unique_triple_neg <- triple_gsea_neg[triple_gsea_neg$NAME %in% unique_triple_names_neg, ]
# for IF
unique_IF_names_neg = setdiff(IF_gsea_neg$NAME,
union(dual_gsea_neg$NAME,
union(triple_gsea_neg$NAME,
all_gsea_neg$NAME)))
unique_IF_neg <- IF_gsea_neg[IF_gsea_neg$NAME %in% unique_IF_names_neg, ]
# Table displaying a summary of positive and unqiue pathways
unique_pos_summary_table <- data.frame(dual = unique_dual_pos$NAME[1:15],
IF = unique_IF_pos$NAME[1:15],
Triple = unique_triple_pos$NAME[1:15])
# Table displaying a summary of negative and unqiue pathways
unique_neg_summary_table <- data.frame(dual = unique_dual_neg$NAME[1:15],
IF = unique_IF_neg$NAME[1:15],
Triple = unique_triple_neg$NAME[1:15])
# Comparison with tables pulled from gProfiler analysis in assignment 2
kableExtra::kable_styling(knitr::kable(unique_pos_summary_table,
caption = "Top unique positive pathways (GSEA)"),
font_size = 10)
| dual | IF | Triple |
|---|---|---|
| PID_RXR_VDR_PATHWAY%MSIGDB_C2%PID_RXR_VDR_PATHWAY | SIGNALING BY NTRK2 (TRKB)%REACTOME DATABASE ID RELEASE 65%9006115 | REELIN SIGNALING PATHWAY%PATHWAY INTERACTION DATABASE NCI-NATURE CURATED DATA%REELIN SIGNALING PATHWAY |
| LOSS OF PROTEINS REQUIRED FOR INTERPHASE MICROTUBULE ORGANIZATION FROM THE CENTROSOME%REACTOME%R-HSA-380284.3 | JOUBERT SYNDROME%WIKIPATHWAYS_20240210%WP4656%HOMO SAPIENS | BETA1 ADRENERGIC RECEPTOR SIGNALING PATHWAY%PANTHER PATHWAY%P04377 |
| THROMBOXANE SIGNALLING THROUGH TP RECEPTOR%REACTOME%R-HSA-428930.4 | PROTEIN LOCALIZATION TO ENDOPLASMIC RETICULUM%GOBP%GO:0070972 | BETA2 ADRENERGIC RECEPTOR SIGNALING PATHWAY%PANTHER PATHWAY%P04378 |
| COPI-DEPENDENT GOLGI-TO-ER RETROGRADE TRAFFIC%REACTOME DATABASE ID RELEASE 65%6811434 | REGULATION OF PROTEIN LOCALIZATION TO CELL SURFACE%GOBP%GO:2000008 | ENKEPHALIN RELEASE%PANTHER PATHWAY%P05913 |
| ANCHORING OF THE BASAL BODY TO THE PLASMA MEMBRANE%REACTOME DATABASE ID RELEASE 65%5620912 | LYSOSOME ORGANIZATION%GOBP%GO:0007040 | HETEROTRIMERIC G-PROTEIN SIGNALING PATHWAY-GI ALPHA AND GS ALPHA MEDIATED PATHWAY%PANTHER PATHWAY%P00026 |
| THROMBIN SIGNALLING THROUGH PROTEINASE ACTIVATED RECEPTORS (PARS)%REACTOME%R-HSA-456926.4 | AUTOPHAGOSOME ORGANIZATION%GOBP%GO:1905037 | HISTAMINE H2 RECEPTOR MEDIATED SIGNALING PATHWAY%PANTHER PATHWAY%P04386 |
| WNT LIGAND BIOGENESIS AND TRAFFICKING%REACTOME DATABASE ID RELEASE 65%3238698 | SPERMATOGENESIS%GOBP%GO:0007283 | METABOTROPIC GLUTAMATE RECEPTOR GROUP II PATHWAY%PANTHER PATHWAY%P00040 |
| RECRUITMENT OF MITOTIC CENTROSOME PROTEINS AND COMPLEXES%REACTOME%R-HSA-380270.4 | POSITIVE REGULATION OF CARTILAGE DEVELOPMENT%GOBP%GO:0061036 | METABOTROPIC GLUTAMATE RECEPTOR GROUP I PATHWAY%PANTHER PATHWAY%P00041 |
| CENTROSOME MATURATION%REACTOME%R-HSA-380287.4 | LYTIC VACUOLE ORGANIZATION%GOBP%GO:0080171 | MUSCARINIC ACETYLCHOLINE RECEPTOR 2 AND 4 SIGNALING PATHWAY%PANTHER PATHWAY%P00043 |
| LOSS OF NLP FROM MITOTIC CENTROSOMES%REACTOME%R-HSA-380259.3 | PHOSPHOLIPID BIOSYNTHETIC PROCESS%GOBP%GO:0008654 | SYNAPTIC_VESICLE_TRAFFICKING%PANTHER PATHWAY%P05734 |
| RECRUITMENT OF NUMA TO MITOTIC CENTROSOMES%REACTOME%R-HSA-380320.5 | GLYCEROPHOSPHOLIPID METABOLIC PROCESS%GOBP%GO:0006650 | ACTIVATION OF GABAB RECEPTORS%REACTOME DATABASE ID RELEASE 65%991365 |
| DIACYLGLYCEROL METABOLIC PROCESS%GOBP%GO:0046339 | GLYCEROPHOSPHOLIPID BIOSYNTHETIC PROCESS%GOBP%GO:0046474 | ION HOMEOSTASIS%REACTOME DATABASE ID RELEASE 65%5578775 |
| RETINA DEVELOPMENT IN CAMERA-TYPE EYE%GOBP%GO:0060041 | POSTSYNAPTIC SIGNAL TRANSDUCTION%GOBP%GO:0098926 | RAS ACTIVATION UPON CA2+ INFLUX THROUGH NMDA RECEPTOR%REACTOME DATABASE ID RELEASE 65%442982 |
| SENSORY SYSTEM DEVELOPMENT%GOBP%GO:0048880 | NA | G PROTEIN GATED POTASSIUM CHANNELS%REACTOME DATABASE ID RELEASE 65%1296059 |
| SPINAL CORD DEVELOPMENT%GOBP%GO:0021510 | NA | NEGATIVE REGULATION OF NMDA RECEPTOR-MEDIATED NEURONAL TRANSMISSION%REACTOME DATABASE ID RELEASE 65%9617324 |
kableExtra::kable_styling(knitr::kable(GO_upregulated[1:15,1:3],
caption = "Top unique positive pathways (gProfiler)"),
font_size = 10)
| Dual | IF | Triple |
|---|---|---|
| Notch signaling pathway | glycoprotein metabolic process | dendrite extension |
| negative regulation of growth | negative regulation of fat cell differentiation | regulation of G protein-coupled receptor signaling pathway |
| nerve development | tissue morphogenesis | regulation of neuron projection arborization |
| cranial nerve development | embryonic heart tube morphogenesis | calcium ion import across plasma membrane |
| marginal zone B cell differentiation | determination of digestive tract left/right asymmetry | response to acetylcholine |
| cell migration in hindbrain | epithelial tube morphogenesis | cellular response to acetylcholine |
| autonomic nervous system development | regulation of neuroblast proliferation | G protein-coupled acetylcholine receptor signaling pathway |
| retina layer formation | regulation of chondrocyte differentiation | acetylcholine receptor signaling pathway |
| cellular response to retinoic acid | ossification | calcium ion import into cytosol |
| regulation of bone resorption | melanin metabolic process | inhibitory synapse assembly |
| cerebellum development | melanin biosynthetic process | regulation of calcium ion transmembrane transport |
| metencephalon development | embryonic epithelial tube formation | protein depolymerization |
| ventricular cardiac muscle cell membrane repolarization | neural crest cell development | regulation of dendrite extension |
| cell-cell junction maintenance | morphogenesis of an epithelium | cellular response to metal ion |
| regulation of mesenchymal stem cell differentiation | phenol-containing compound biosynthetic process | cellular response to brain-derived neurotrophic factor stimulus |
kableExtra::kable_styling(knitr::kable(unique_neg_summary_table,
caption = "Top unique negative pathways (GSEA)"),
font_size = 10)
| dual | IF | Triple |
|---|---|---|
| COMPLEX I BIOGENESIS%REACTOME%R-HSA-6799198.3 | EPIGENETIC REGULATION OF GENE EXPRESSION%REACTOME%R-HSA-212165.5 | HALLMARK_IL2_STAT5_SIGNALING%MSIGDBHALLMARK%HALLMARK_IL2_STAT5_SIGNALING |
| FORMATION OF THE CORNIFIED ENVELOPE%REACTOME DATABASE ID RELEASE 65%6809371 | TRNA MODIFICATION IN THE NUCLEUS AND CYTOSOL%REACTOME DATABASE ID RELEASE 65%6782315 | SUPERPATHWAY OF PURINE NUCLEOTIDE SALVAGE%BIOCYC%PWY66-409 |
| MAMMARY GLAND DEVELOPMENT PATHWAY PREGNANCY AND LACTATION STAGE 3 OF 4 %WIKIPATHWAYS_20240210%WP2817%HOMO SAPIENS | MITOTIC G2-G2 M PHASES%REACTOME%R-HSA-453274.4 | SUPERPATHWAY OF METHIONINE DEGRADATION%BIOCYC%PWY-5328 |
| ENERGY DERIVATION BY OXIDATION OF ORGANIC COMPOUNDS%GOBP%GO:0015980 | POLYMERASE SWITCHING ON THE C-STRAND OF THE TELOMERE%REACTOME DATABASE ID RELEASE 65%174411 | EGFR1%IOB%EGFR1 |
| CELLULAR RESPIRATION%GOBP%GO:0045333 | NUCLEOTIDE EXCISION REPAIR%REACTOME DATABASE ID RELEASE 65%5696398 | FAS%IOB%FAS |
| MITOCHONDRIAL RESPIRATORY CHAIN COMPLEX ASSEMBLY%GOBP%GO:0033108 | G2 M TRANSITION%REACTOME DATABASE ID RELEASE 65%69275 | PID_SYNDECAN_4_PATHWAY%MSIGDB_C2%PID_SYNDECAN_4_PATHWAY |
| TRNA MODIFICATION%GOBP%GO:0006400 | BASE EXCISION REPAIR%REACTOME DATABASE ID RELEASE 65%73884 | PID_INTEGRIN3_PATHWAY%MSIGDB_C2%PID_INTEGRIN3_PATHWAY |
| POSITIVE REGULATION OF HYDROLASE ACTIVITY%GOBP%GO:0051345 | SUMO E3 LIGASES SUMOYLATE TARGET PROTEINS%REACTOME%R-HSA-3108232.8 | HIF-1-ALPHA TRANSCRIPTION FACTOR NETWORK%PATHWAY INTERACTION DATABASE NCI-NATURE CURATED DATA%HIF-1-ALPHA TRANSCRIPTION FACTOR NETWORK |
| AEROBIC RESPIRATION%GOBP%GO:0009060 | PRADER WILLI AND ANGELMAN SYNDROME%WIKIPATHWAYS_20240210%WP3998%HOMO SAPIENS | EGFR1%NETPATH%EGFR1 |
| PROTON MOTIVE FORCE-DRIVEN MITOCHONDRIAL ATP SYNTHESIS%GOBP%GO:0042776 | SISTER CHROMATID COHESION%GOBP%GO:0007062 | ARGININE AND PROLINE METABOLISM%SMPDB%SMP0000020 |
| POSITIVE REGULATION OF RESPONSE TO BIOTIC STIMULUS%GOBP%GO:0002833 | NUCLEOSOME ASSEMBLY%GOBP%GO:0006334 | PROLIDASE DEFICIENCY (PD)%PATHWHIZ%PW000083 |
| POSITIVE REGULATION OF INTRACELLULAR SIGNAL TRANSDUCTION%GOBP%GO:1902533 | MITOTIC CHROMOSOME CONDENSATION%GOBP%GO:0007076 | ARGININE: GLYCINE AMIDINOTRANSFERASE DEFICIENCY (AGAT DEFICIENCY)%PATHWHIZ%PW000084 |
| REGULATION OF PROTEIN MATURATION%GOBP%GO:1903317 | NEGATIVE REGULATION OF TELOMERE MAINTENANCE VIA TELOMERE LENGTHENING%GOBP%GO:1904357 | HYPERPROLINEMIA TYPE II%SMPDB%SMP0000360 |
| REGULATION OF INSULIN RECEPTOR SIGNALING PATHWAY%GOBP%GO:0046626 | SPLICEOSOMAL COMPLEX ASSEMBLY%GOBP%GO:0000245 | HYPERPROLINEMIA TYPE I%SMPDB%SMP0000361 |
| NA | MRNA TRANSPORT%GOBP%GO:0051028 | PROLINEMIA TYPE II%PATHWHIZ%PW000087 |
kableExtra::kable_styling(knitr::kable(GO_downregulated[1:15,1:3],
caption = "Top unique negative pathways (gProfiler)"),
font_size = 10)
| Dual | IF | Triple |
|---|---|---|
| cell junction assembly | DNA replication, removal of RNA primer | regulation of apoptotic signaling pathway |
| regulation of ventricular cardiac muscle cell action potential | positive regulation of nuclease activity | apoptotic signaling pathway |
| phosphatidylinositol 3-kinase/protein kinase B signal transduction | postreplication repair | antigen processing and presentation of peptide antigen |
| epithelial tube morphogenesis | positive regulation of deoxyribonuclease activity | nucleoside catabolic process |
| cell-substrate junction assembly | cell division | sulfur compound metabolic process |
| positive regulation of extrinsic apoptotic signaling pathway | mRNA pseudouridine synthesis | positive regulation of programmed cell death |
| regulation of endocrine process | negative regulation of mammary gland epithelial cell proliferation | positive regulation of apoptotic process |
| ameboidal-type cell migration | tRNA pseudouridine synthesis | gland morphogenesis |
| stress fiber assembly | pyrimidine nucleoside monophosphate biosynthetic process | positive regulation of cytokine production |
| contractile actin filament bundle assembly | DNA strand invasion | carbohydrate derivative catabolic process |
| pteridine-containing compound biosynthetic process | olfactory lobe development | hepatocyte proliferation |
| cellular response to peptide hormone stimulus | protein-DNA complex assembly | epithelial cell proliferation involved in liver morphogenesis |
| positive regulation of lipid transport | pyrimidine-containing compound metabolic process | liver morphogenesis |
| negative regulation of wound healing | regulation of cyclin-dependent protein kinase activity | nucleobase-containing small molecule catabolic process |
| cell-substrate junction organization | transposition | nucleotide-sugar biosynthetic process |
# Table of nodes and edges data
nodes_edges <- data.frame(`Neural Organoids` = c("307","1914"),
Dual = c("29","36"),
Triple = c("260","432"),row.names = c("Nodes","Edges"))
kableExtra::kable_minimal(knitr::kable(nodes_edges,
caption = "Nodes and Edges Summary"))
| Neural.Organoids | Dual | Triple | |
|---|---|---|---|
| Nodes | 307 | 29 | 260 |
| Edges | 1914 | 36 | 432 |
All maps were created using P-value < 0.05 and FDR/Q-value < 0.05.
Neural Organoid initial network
Figure 1A: Neural organoid initial network with 307 nodes and 1914
edges.
Dual initial network Figure 1B:
Dual inhibition initial network with 29 nodes and 36 edges.
Triple initial network Figure
1C: Triple inhibition initial network with 260 nodes and 432 edges.
# Table of nodes and edges data
annotate_params <- data.frame(`Parameter` = c("Wordcloud: Adjacent Word (default)","3","1","8","TRUE"),
row.names =c("Label Algorithm:","Max words per label:","Minimum word occurrence:","Adjacent word bonus:","Layout network to prevent cluster overlap:"))
kableExtra::kable_minimal(knitr::kable(annotate_params ,
caption = "Autoannotate parameters"))
| Parameter | |
|---|---|
| Label Algorithm: | Wordcloud: Adjacent Word (default) |
| Max words per label: | 3 |
| Minimum word occurrence: | 1 |
| Adjacent word bonus: | 8 |
| Layout network to prevent cluster overlap: | TRUE |
Neural Organoid autoannotated network Figure 2A: Neural
Organoid autoannotated network
Dual autoannotated network Figure
2B: Dual inhibition autoannotated network.
Triple autoannotated network
Figure 2C: Triple inhibition autoannotated network.
Neural Organoid publication ready figure Figure 3A: Neural
organoids publication ready network.
Dual publication ready figure
Figure 3B: Dual inhibition publication ready network.
Triple publication ready figure
Figure 3C: Triple inhibition publication ready network.
Neural Organoid summary network
The neural organoid summary network had main upregulated networks related to: WNT signaliing, positive development neurogenesis (PDN), anatomical morphogenesis neuron (AMN), modulation of excitatory potential (MEP) and microtubule processes. This fits with the model for several reasons. The PDN, AMN and and MEP networks are expected in a neural organoid given that they are standard aspects of neurodevelopment. WNT signalling is likely active in many aspects of development in a non-neural specifi way, but it is also well documented in brain development. Microtubule processes are more broad, however, studies have shown that cytoskeletal reorganization plays an important role in neural differentiation and lineage commitment.
The downregulated genes appear to relate to immune function and cell division (post-mitotic degradation, sister chromatid segregation, atr replication stress). Immune function is interesting because studies have shown that immune activation during neurodevelopment can be damaging. It is reasonable such processes would be downregulated. The downregulation of mitotic activity is an observed phenomenon which occurs at some stages of neurodevelopment. It is hypothesized that this downregulation enables cytoskeletal processes to act on neuronal growth and extension instead.
Figure 4A: Neural Organoid
summary network.
Dual summary network
The dual inhibition results are more puzzling. This is given that this network excludes all common neurdevelopment networks with other protocols. The downregulationg of positive immune response may be expected given the previous logic relating to immune activing during the brain. However, negative regulation of cell differentiation might hunt that the dual inhibited organoids are in a more progenitor stage and don’t differentiate as readily. Additionally, the eye development node may indicate non-commitment to a brain neural lineage. This could support the lack of differentiation.
Figure
4B: Dual inhibition summary network.
Triple summary network
The triple network has some interesting nodes which differentiate it from other protocols. A lot of the positive nodes such as dendrite organization, nervous growth and nerve growth factor might indicate that these organoids have more mature phenotype which is undergoing more neurogenesis. This is in comparison to Dual inhibition which seemed to be in a more progenitor stage. Meanwhile budding host complex, glutamate release cycle and activation nmda, are all nodes relating to excitatory potential. These two facts combined may indicate that these organoids are differentiating, forming synapses and using neurotransmitters. Interestingly, there was many negatively regulated nodes related to metabolism such as glycolytic carbohydrates, regulation of lipid transport and purine nucleoside deficiency. Although this is not well characterized, some studies have shown that stem cells have differing and sometimes higher metabolic than neurons. Overall, many of the negative nodes are uncharacterized and maybe me goodareas of novel investigation.
Figure 4C: Triple inhibition summary network.
My conclusion in assignment 2 was that yes, the thresholded analysis does yield some similarity to the paper. This is because. there was an alignment between the heterogeneity in organoids and WNT/Notch signaling. The GSEA non-thresholded analysis does seem to corroborate the overall issue with organoid stochasticity. There was no clear delineation or pathways immediately obvious from the data which might indicate that even dual and triple inhibition protocols are fairly unguided and involve plenty of self assembly. However, as noted in a previous section, my thresholded and non-thresholded analyses were quite different with no direct comparisons.
One insight that does corroborate information from the paper is the observation that triple inhibited organoids may be in a more in a more developed state. In the original paper, these organoids highly expressed genes for the neocortex and medial pallium while not expressing as many genes for neural stem cells. Alternatively, the paper showed that the dual inhibition protocol had much higher neural stem cell gene expression and much lower regional (ex neocortex, pallium) expression. These insights could line up with our observations that triple-i nodes were more focused on specified neural action while dual-i nodes were more stem like.
More broadly, the combined protocol all neural organoid analysis does corroborate well with the paper. The developing neural organoids, which rely heavily on modulaitng cytoarchitectural features, did have upregulation of cytoskeletal and microtubule processes in my analysis.
There is some evidence supporting my observed results. Firstly, the observation that the triple-i protocol might be more differentiated than the dual-i protocol does line up with previous observations regarding their use. Triple-i protocols are directed at outer radial glia (oRGs) which are bonafide neural precursors that generate the majority of the neural growth and diversity. As such, my results showing the differentiation of the triple-i organoids could tie into that function. Meanwhile dual protocols which are less differentiated much more closely resemble ventral radial glia (vRG) which are also neurogenic but to a lesser extent.
Additionally, observations around the important of cytoskeletal and microtubule signalling in organoid formation are corroborated. Multiple studies have shown that neural precursors found in organoids may rely on cytoskeletal signalling to establish a niche. Similarly, these cytoskeletal regulation genes may be important for cell fate and decision making during development.
I performed a post-analysis using the bader lab transcription factor (TF) gene set. I specifically looked for TFs which are either 1. of interest in oRG biology during neurodevelopment. 2. linked with neurodevelopmental disorders. I used a Mann-Whitney test thresholded at p-value < 0.05
I found that FOXG1, which is an important marker for telencephalon development was linked with several upregulated nodes including neural prjection. Meanwhile FOXO1, which is an important regultor of neurodevelopment was also linked with neuronal morphogenesis, neurogenesis and synapse formation. Knockouts of FOXO TFs have shown to cause several disease phenotypes such as degeneration. Unfortunately, neither of those genes were picked up in the unique analysis, likely due to to it’s limited nature.
Surprisingly, I did not find a strong link between the any analysis and PAX6, which is a strong regulator of neural precursor. It only somewhat interacted with microtubule processes and proteasome degradation. The triple-i protocol had some links with NKX2.5, which is thought to be the upstream TF controlling HOPX expression. Whereby HOPX is known to be the most established marker of oRGs. This may support the role of the triple-i protocol in oRG generation.
Neural Organoid transcription factor post-analysis
network
Dual transcription factor post-analysis network
Triple transcription factor post-analysis network
Summary Enrichment Comparison Q3
Interpretation for Original paper Q1
Rosebrock D, Arora S, Mutukula N, Volkman R, Gralinska E, Balaskas A, et al. Enhanced cortical neural stem cell identity through short SMAD and WNT inhibition in human cerebral organoids facilitates emergence of outer radial glial cells. Nat Cell Biol. 2022;24(6):981–95.
Pollen AA, Nowakowski TJ, Chen J, Retallack H, Sandoval-Espinosa C, Nicholas CR, et al. Molecular Identity of Human Outer Radial Glia During Cortical Development. Cell. 2015 Sep 24;163(1):55–67.
Kriegstein A, Noctor S, Martínez-Cerdeño V. Patterns of neural stem and progenitor cell division may underlie evolutionary cortical expansion. Nat Rev Neurosci. 2006 Nov;7(11):883–90.
Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, Carvalho BS, et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods. 2015 Feb;12(2):115–21.
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010 Jan 1;26(1):139–40.
Korotkevich G, Sukhov V, Sergushichev A (2019). “Fast gene set enrichment analysis.” bioRxiv.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research. 2003;13(11):2498–504.
Merico D, Isserlin R, Stueker O, Emili A, Bader GD (2010) Enrichment Map: A Network-Based Method for Gene-Set Enrichment Visualization and Interpretation. PLoS ONE 5(11): e13984. https://doi.org/10.1371/journal.pone.0013984
Kucera M, Isserlin R, Arkhangorodsky A, Bader GD. AutoAnnotate: A Cytoscape app for summarizing networks with semantic annotations. F1000Res. 2016 Jul 15;5:1717.
Oesper, L., Merico, D., Isserlin, R., & Bader, G. D. (2011). WordCloud: a Cytoscape plugin to create a visual semantic summary of networks. Source Code for Biology and Medicine, 6(1), 7.
Assenov Y, Ramírez F, Schelhorn SE, Lengauer T, Albrecht M. Computing topological parameters of biological networks. Bioinformatics. 2008 Jan 15;24(2):282-4. doi: 10.1093/bioinformatics/btm554. Epub 2007 Nov 15. PMID: 18006545.
Utriainen, M., Morris, J.H. clusterMaker2: a major update to clusterMaker, a multi-algorithm clustering app for Cytoscape. BMC Bioinformatics 24, 134 (2023).
Kolberg L, Raudvere U, Kuzmin I, Adler P, Vilo J, Peterson H. g:Profiler-interoperable web service for functional enrichment analysis and gene identifier mapping (2023 update). Nucleic Acids Res. 2023 Jul 5;51(W1):W207–12. Methods. 2015;12(2):115–121.
Faissner A, Reinhard J. The extracellular matrix compartment of neural stem and glial progenitor cells. Glia. 2015 Aug;63(8):1330-49. doi: 10.1002/glia.22839. Epub 2015 Apr 22. PMID: 25913849.
Monet, M. C., & Quan, N. (2023). Complex Neuroimmune Involvement in Neurodevelopment: A Mini-Review. Journal of Inflammation Research, 16, 2979–2991.
Jády AG, Nagy ÁM, Kőhidi T, Ferenczi S, Tretter L, Madarász E. Differentiation-Dependent Energy Production and Metabolite Utilization: A Comparative Study on Neural Stem Cells, Neurons, and Astrocytes. Stem Cells Dev. 2016 Jul 1;25(13):995-1005.
Kawaguchi A (2021) Neuronal Delamination and Outer Radial Glia Generation in Neocortical Development. Front. Cell Dev. Biol. 8:623573.
Miranda-Negrón Y and García-Arrarás JE (2022) Radial glia and radial glia-like cells: Their role in neurogenesis and regeneration. Front. Neurosci. 16:1006037.
Hettige NC, Ernst C. FOXG1 Dose in Brain Development. Front Pediatr. 2019 Nov 22;7:482.
Santo EE, Paik J. FOXO in Neural Cells and Diseases of the Nervous System. Curr Top Dev Biol. 2018;127:105-118.
Götz M, Stoykova A, Gruss P. Pax6 controls radial glia differentiation in the cerebral cortex. Neuron. 1998 Nov;21(5):1031-44.
Uhlén M et al., Tissue-based map of the human proteome. Science (2015) PubMed: 25613900 DOI: 10.1126/science.1260419
Caspa Gokulan R, Yap LF, Paterson IC. HOPX: A Unique Homeodomain Protein in Development and Tumor Suppression. Cancers (Basel). 2022 Jun 2;14(11):2764.
Kochinke K, Zweier C, Nijhof B, Fenckova M, Cizek P, Honti F, Keerthikumar S, Oortveld MA, Kleefstra T, Kramer JM, Webber C, Huynen MA, Schenck A. Systematic Phenomics Analysis Deconvolutes Genes Mutated in Intellectual Disability into Biologically Coherent Modules. Am J Hum Genet. 2016 Jan 7;98(1):149-64. doi: 10.1016/j.ajhg.2015.11.024. PMID: 26748517; PMCID: PMC4716705.